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Abstract 

We give an exact solution to the Kolmogorov equation describing genetic drift for 
an arbitrary number of alleles at a given locus. This is achieved by finding a change 
of variable which makes the equation separable, and therefore reduces the problem 
with an arbitrary number of alleles to the solution of a set of equations that are 
essentially no more complicated than that found in the two-allele case. The same 
change of variable also renders the Kolmogorov equation with the effect of muta- 
tions added separable, as long as the mutation matrix has equal entries in each row. 
Thus this case can also be solved exactly for an arbitrary number of alleles. The 
general solution, which is in the form of a probability distribution, is in agreement 
with the previously known results — which were for the cases of two and three alleles 
only. Results are also given for a wide range of other quantities of interest, such 
as the probabilities of extinction of various numbers of alleles, mean times to these 
extinctions, and the means and variances of the allele frequencies. To aid dissemi- 
nation, these results are presented in two stages: first of all they are given without 
derivations and too much mathematical detail, and then subsequently derivations 
and a more technical discussion are provided. 

Key words: Population genetics, diffusion, Kolmogorov equation, genetic drift, 
single-locus. 



1 Introduction 



Although probabilistic ideas are at the heart of genetics, and have been used 
since the earliest days of the subject, it was in the study of genetic drift in the 
1920s and 1930s that the notion of stochastic processes first played a major 
part in the theory of genetics. In the simplest diploid population of a 
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fixed size, A^, was considered, and attention was focused on two alleles at a 
particular locus. If selective differences between the two alleles, as well as the 
chances of mutation, are ignored, then the changes in the allele frequencies 
are purely due to genetic drift. Assuming discrete non-overlapping generations, 
one can ask: what is the probability that n of the 2N genes in the population 
alive at time t are of a given type? This formulation of genetic drift is usually 
referred to as the Fisher- Wright model, being used implicitly by Fisher [1], 
and explicitly by Wright [2]. 

Although Fisher and Wright did not use this terminology, the stochastic pro- 
cess they defined through this model is a Markov chain, since the probabili- 
ties of genetic change from one generation to the next do not depend on the 
changes made in previous generations. However the use of Markov chains be- 
comes cumbersome when the effects of mutation and selection are introduced, 
and for this reason there was a move away from the description of the process 
in terms of a discrete number of alleles n = 0,1..., 2A^, and discrete gener- 
ations, to a process of diffusion where the fraction of alleles of one type is a 
real random variable x{t) in the interval [0, 1] and time is continuous. This 
type of model actually predates the Markov chain description [3], and was 
further studied by Wright [4] and Kimura [5]. It is this formulation that will 
concern us here, specifically we will be interested in neutral evolution — that 
is, the dynamics of a randomly mating population which may, or may not, be 
subject to genetic mutation. 

Our motivation for the work presented here lies in the study of an evolutionary 
model of language change [6,7,8,9]. As a first step in the mathematical formu- 
lation of this model, we developed a "drift model" of language change [10], 
which turns out to be identical to the diffusion models of population genetics 
discussed above. Particularly relevant in this application to language change 
is the situation where the number of variants (the linguistic counterpart to 
alleles) may be large. A survey of the population genetics literature revealed 
a considerable amount of work on the diffusion equation for two alleles [11], 
some on three alleles [12] and very little on the general case of an arbitrary 
number of alleles [13]. Much of the work originated with Kimura in the 1950s, 
and developments since then appear scattered throughout the literature. 

In this work we fill a considerable gap in the literature by giving for the 
first time an analytic solution of the diffusion equation, with and without 
mutation, in the general case of M alleles. Given the rather scattered nature 
and — occasionally — the apparent obscurity of other results relating to genetic 
diffusion at a single locus, we take the opportunity here to present them in 
a systematic fashion. In writing this paper we also have a third aim, namely 
to present this information in such a way that is accessible to geneticists who 
may not wish to work through a great deal of mathematical analysis. We 
will shortly describe the plan of the paper which we hope will allow us to 
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accomplish this. 



After Kimura's pioneering work in the 1950s, there was further work by him 
[14,15,16,17] and others [18,19,20,21] on single-locus models, but interest nat- 
urally moved to models which involved multiple loci and the interactions be- 
tween them [22,23,24,25,26,27]. Despite the continuing development of pop- 
ulation genetics, very little has been achieved by way of exact solutions for 
single-locus models since Kimura [12] and Littler [18,19]. Occasional work has 
appeared, both of an analytic nature [28,29,30,31] and involving numerical 
simulations [32,33,34,35], although single locus diffusion models continue to 
receive proportionately less attention. 

A number of other factors have been responsible for the relative neglect of 
this particular area of population genetics. One of the main ones is that the 
focus has shifted to experimental investigations and there has been the conse- 
quential realisation that reality is very much more complex than what simple 
models allow. However, it is also the case that the theory is perceived as math- 
ematically difficult. Even Kimura [12] states when studying the extension to 
three alleles that for the partial differential equation describing diffusion "the 
general case of an arbitrary number of alleles can be solved [by separation of 
variables]. However, additional techniques will be needed to make the math- 
ematical manipulations manageable". We will show here that this is not the 
case: in fact it is no harder to solve the case of the diffusion of a general 
number of alleles than the three-allele case. 

We believe that, while our motivation for this work originates with models of 
language change, geneticists will also find the results we obtain useful. The 
diffusion theory is described in several standard texts on population genet- 
ics [36,37,38,39] and general consequences of the results previously found in 
the form of simple rules are widely known and utilised. The problem, as we 
have already indicated, is the mathematical nature of the theory. The diffu- 
sion approximation in the case without mutation results in a partial differential 
equation which is of the type which is used to describe the diffusion of heat 
in a metal or dye in a fluid [37]. The inclusion of mutation corresponds in 
these physical problems to a deterministic force which biases the diffusion 
in one direction rather than another. For example, if the particles diffusing 
were charged, then this deterministic force could arise from an electric fleld 
being imposed on the system. Not surprisingly, diffusion equations of this 
kind have been extensively studied by physicists who call them Fokker-Planck 
equations [40,41]. Here we will use the nomenclature used in the genetics liter- 
ature where they are called Kolmogorov equations. There are other confusing 
changes in nomenclature across disciplines: for example, the term represent- 
ing the deterministic motion discussed above in the Kolmogorov equation is 
called the drift in the physics literature, but in the genetics literature it is the 
diffusive term which is called the drift. In order to avoid confusion we shall 
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now drop the further use of the term "drift" in this work. 

The outhne of the paper is as follows. In section 2 we will give on overview of 
our results which is devoid of proofs and too much mathematical detail, so that 
it may be read by those who simply wish to have a reasonably non-technical 
summary of the results of the paper. It will also serve as a general introduction 
to those who wish to go on and absorb the more technical sections of the paper. 
These technical sections start with section 3, where the general solution of the 
Kolmogorov equation for an arbitrary number of alleles, with and without 
mutations, is derived. In section 4 we give details of the calculation of various 
quantities of interest that are given briefly at the end of section 2. We end 
in section 5 with an overview of the paper's main results and point out some 
remaining unsolved cases. Further technical material has been relegated to 
three appendices so as not to interrupt the flow of the arguments in the main 
text. 



2 Overview of the main results 

In this section we will describe the problems we investigate, explaining our 
methods of analysing them and give the results we have obtained without 
proof. 

2. 1 Definition of the model 

The simplest, and most widely studied, version of the problems we will be 
investigating in this paper, is where two alleles, ai and 02, are segregating in a 
randomly-mating population of monoecious individuals. We begin with the 
purely random case in which the only way that allele frequencies may change 
is through the random sampling of gametes in the reproductive process. If x 
is the frequency of allele ai and 1 — x the frequency of allele 02, we would 
expect that the value of x would change through time until eventually one or 
other of the alleles would become fixed, i.e. x would take on the value or 
1. The mathematical quantity which describes the process is the conditional 
probability distribution function (pdf) P(x, t|xo,0) — the probability that the 
frequency of the ai allele at time t is x, given the frequency of this allele at 
t = was Xq. It satisfies the Kolmogorov equation [11,36,37] 

^ = ^[^(^)^]> D{x) = \x{l~x), (1) 

in the interval < x < 1. This is a diffusion equation for the frequency of the Oi 
allele, but with a diffusion coefficient, D{x), which depends on the frequency. 
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The form of the diffusion coefficient is such that it gets much weaker as x 
approaches the boundaries at x = and x = 1. However — as we show using 
a simple argument in Appendix A — it is still sufficiently strong that fixation 
can occur, but once this has happened there is no possibility of re-entering the 
region < x < 1. (In Feller's classification [42], these are exit boundaries). As 
we shall see, it will be necessary to consider the boundaries separately from 
the interior region when calculating probabilities. 

It should also be noted that in the derivation of Eq. (1) (see Section 3), one unit 
of time is taken to be 2N generations, so that the time between generations 
is 1/(2A^). Although this choice is useful in derivations involving the diffusion 
approximation, it is frequently more useful to work with r = 2Nt, so that the 
unit of time is that between generations. Then Eq. (1) takes the form 

dP 1 92 

which is the one that generally appears in textbooks [11]. 

When M alleles ai, 02, . . . , are present, these ideas can be generalised. If 
Xi,X2, . . . ,xm are the frequencies of the alleles, then there are only M — 1 
independent variables — the frequency xm niay be replaced by 1 — Xi — . . . — 
xm-i- The Kolmogorov equation is now an (M — l)-dimensional diffusion 
equation [13] 

op A/-1J\/-1 02 

ar=s: Ea^iwi. (3) 

where the diffusion coefficient is now a matrix: 

A.(x) = 0';'^'~'^^^' ' ' (4) 




If mutations occurring at constant rates are allowed for, these Kolmogorov 
equations acquire an extra term, which involves only first order derivatives. 
In the case of two alleles, if 02 mutates to at a constant rate given by mi 
and a\ mutates to a2 at a constant rate given by m2, then in the absence 
of the random process described above, we would expect the frequency of 
the a\ allele to change according to the deterministic differential equation 
X = mi(l — x) — m2X, where x = If we include both this deterministic 
process and the random diffusion process, the Kolmogorov equation takes the 
form [36]: 

where ^ 

A{x) = mi(l — x) — m2X , D{x) = -x(l — x) . (6) 
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Finally, if M alleles are present with mutation, we shall in common with other 
authors [43], only study the situation in which the rate of mutation of the 
alleles aj {j ^ i) to occurs at a constant rate m^, independent of j. Then 
the deterministic equation would be 

Xi = mi{l - Xi) rrijXi = - ^ mjXi = Ai{x) . (7) 

The fact that depends only on Xi is a consequence of mutation rates 

depending only on the end product of the mutation. It is this feature that 
allows solution by separation of variables of the resulting Kolmogorov equation 
which reads 

op M-l Q M-1A[-1 o2 

^--T.^^\M:m^T.^^^\D,M)P], (8) 

where Ai{x) and Dij{x) are given by Eqs. (7) and (4) respectively. 

It is sometimes useful to use the backward form of the Kolmogorov equation, 
in which the derivatives are with respect to the initial values Xq and to: 

a M-l Q Af-lM-l q2 

(9) 

Since for the processes we will be interested in here, P{x,t\xQ,to) will be a 
function only of {t — to), the left-hand side may also be written as —dP/dt, 
and to chosen to be equal to zero. A detailed derivation and explanation of the 
relationship between the forward and backward forms can be found in [40]. 

Note that immigration is commonly introduced [38] by the use of A{x) = 
m{x — x) where x represents the mean of the population where the immigrants 
come from. This situation can be represented by Eq. (5) simply by setting 
mi = mx and m2 = m{l — x). When one or more of the mutation parameters 
is zero, we have irreversible mutation, when the final result is fixation to those 
alleles which have mj not equal to zero. We will not consider this special case 
here, but see for example [44,45]. 

In purely mathematical terms, the purpose of this paper is to obtain the 
conditional pdf -P(x, t|xo, 0) obtained by solving Eqs. (3) and (8), subject to 
the initial conditions that Xi = Xi^ at t = and the appropriate boundary 
conditions. As far as we are aware, only the cases M = 2, 3 without mutation 
and M = 2 with mutation have been solved to date [11]. The key to making 
progress is to find a change of variable which makes these equations separable 
and also to note that the solution with n alleles is nested within the solution 
with n + 1 alleles, so many properties follow by induction on n. 
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2.2 Change of variables 



While the M = 2 Kolmogorov equations (1) and (5) can be solved by separa- 
tion of variables — writing P{x,t) = X{x)T{t) — their counterparts for general 
M, Eqs. (3) and (8) cannot. The standard way to proceed in such a situation 
is to look for a transformation to a coordinate system in which these partial 
differential equations are separable. Kimura [12] gave such a transformation 
for values of M up to 4 without mutation, but not in general. There is no sys- 
tematic way to discover such transformations, but we have found one which 
works for all M, namely: 

u^ = -^ , t = l,...,M-l, (10) 

with the inverse transformation 

Xi = Ui Y[{1 - Uj) . (11) 

j<i 

As discussed in section 3, this change of variables means that the most general 
form of the Kolmogorov equation we consider, Eq. (8), may be written in 
diagonal form: 

QT> M~l Q M-l Q2 

ar-ge^l-4.fe)P|+|:3^W]. (12) 

that is, there are no mixed second-order derivatives. For completeness we give 
the expressions for the functions Ai{u) and Vi{u): 

Afa) J"-: "')"■}, p.(M)4 . (13) 



n,<.(i-%) " 2n,o(i 



u 



Note that this equation has been derived using the new variables — ^just chang- 
ing variables using (10) in the original Kolmogorov equation (8) will not give 
(12), but will instead simply give an equation for P(a;, t|xo, 0) with Xi replaced 
by UiY{j<i{^ ~ Uj). The new conditional pdf in (12), denoted by V{u,t\uQ,0), 
is related to P{x,t\xQ,0) through the Jacobian of the transformation to the 
new variables: 



Vi^u, t\uQ, 0) duidu2 . . . duj\/_i = P{x, t|xo, 0) dxidx2 . . .dx 



M-l 



V{u, t|Mo, 0) = P{x, t\xo, 0) ^i-'^---^'^^-^) . (14) 

d{ui, . . . ,mm-i) 
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The Jacobian is calculated in Appendix B. When M = 3 it is simply (1 —Ui), 
so that 

V{ui,U2, t|Ml,0, M2,0, 0) = (1 - Ml) P(xi, X2, t|xi,o, 3:2,0, 0) , (15) 

and in the general case one finds that 

Viu, t|Mo, 0) = (1 - uif~^ (1 - U2)''~' ... (1 - um^2) Fix, t|xo, 0) . (16) 

We may summarise this subsection in the following way. We have given a 
change of variables which allows the Kolmogorov equation to be written in a 
form which is separable. It is easier to work with the conditional pdf appro- 
priate to these variables, and then to transform back to the original variables 
of the problem and the original conditional pdf using Eqs. (10) and (16) re- 
spectively. 



2.3 Expressions for the conditional probability distribution functions 



The details of the solution of the Kolmogorov equation (12) are given in section 
3 and Appendix C. Here we will present only the final expressions. 

(i) M = 2, no mutations. For orientation, let us first give the expression in the 



case M = 2 and where no mutation is present. This is well known and appears 
in the standard texts [11]. In this case no change of variable is required: u = x. 
Then 

oo r 1 1 

P(x, t\xo, 0) = xo(l - xo) E ciW - 2xo)FKl " 2x) exp --(/ + 1)(/ + 2)t \ . 

1=0 i z ) 

(17) 

The constant q is given by q = [(2/ -|- 3)(/ + 2)]/(/ + 1) and the Fi are 
polynomials, described below. The function (17) satisfies both the initial and 
boundary conditions and holds for all x in the interval (0, 1), that is, excluding 
the boundaries at x = and x = 1. For not too small values of t, the solution 
is well approximated by keeping only the first few terms in Z. In these cases 
the polynomials have a simple form, since Fi [z) is a polynomial in z of order 
/. For instance, 

Fo(^) = l, F^iz) = 2z, F^iz) = -^{l - 5z') , . . . . (18) 

The natural variable for these polynomials is ^ = 1 — 2x, so that they are de- 
fined on the interval — 1 < 2 < 1. For general / they are the Jacobi polynomials 
Pi'''\z) [46]. 
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Fig. 1. Time evolution of the pdf for a biallelic system as determined from the 
analytic solution with no mutation and initial condition xq = 0.7. 

We plot the time development of the solution (17) in Fig. 1. Beginning from a 
delta function at the initial point, the distribution initially spreads out until 
the boundaries are reached. The distribution is soon nearly flat, and subsides 
slowly as probability escapes to the boundaries. Note that the probability 
deposited on the boundaries is not shown here (but will be discussed later). 

(ii) General M, no mutations. For general M, the solution may be written in 
the form 

Viu, t|no, 0) = wiu^) $A(Mo)<I>A(M)e-^* . (19) 

A 

as discussed in Section 3. The solution is written in terms of the u variables 
since the problem is separable and so the eigenf unctions are separable. The 
function w{u) is given by 

M-1 

w{u) = n «i (1 - ' (20) 

and A depends on a set of M — 1 non-negative integers /i, ^2, • • • , ^m-i according 
to 

A = h^^i (Lm^i + 1) , L, = j2 (Ij + 1) • (21) 

This means that the sum over A in Eq. (19) is in fact an {M — 1)— dimensional 
sum. 

The property, which we have already remarked upon, that the system with 
M — 1 alleles is nested in that with M, manifests itself here in the fact that the 
functions $a are very closely related to the functions which already appear in 
the two allele solution (17). First of all, since the problem is separable in this 
coordinate system, they may be written as 

M-1 

<!>x{u,,U2,...,um-i) = n (22) 

i=l 
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Fig. 2. Time evolution of the pdf for a triallelic system as determined from the 
analytic solution with no mutation and initial condition xi^o = 2:2,0 = 0.33. 

Each of the factors ip^^^ in this product separately satisfies the same differential 
equation. We shall derive this equation in Section 3, from which we will learn 
that ip^^'^ depends only on Ui and on the integers k and Specifically, 

= [ciXL.-i)f' (1 - p(i,2L._.+i)^^ _ 2n,) . (23) 

Here the q. are the analogue of the q that appeared in Eq. (17), but in this 
case we have included them in the function $a, so that it satisfies the simple 
orthogonality relation 

Jduw{u)^x (m) $ A' (m) = ^ a A' • (24) 

The explicit form of these constants is 

,^ , (2/, + 3 + 2L,_i)(/i + 2 + 2Li„i) 
ciAL^^i) = • (25) 

The P(i'/5)('l — 2u), now with (3 = 2Lj_i + 1, are again Jacobi Polynomials [46]. 

The general solution for M > 2 has previously only been found for the case 
M = 3. We have checked that our solution agrees with the published result [12] 
in this case, although the labels on the alleles are permuted between Kimura's 
solution and ours [xi X3) due to a different change of variable being made. 
We plot this solution for a triallelic system in Fig. 2 at several different times. 
Just as in the two allele system, the distribution initially spreads out and forms 
a bell shape, which quickly collapses, and becomes nearly fiat, then subsides 
slowly as probability escapes to the boundaries. Note that the probability 
deposited on the boundaries is not shown here. Notice also the triangular 
shape of the region over which the distribution is supported. 



(iii) M = 2, with mutations. When mutation is present, the solution is again 
well known [11] for the M = 2 case. Again, no change of variable is required: 



10 



u = X. Then 



P(x,t|a:o,0) = 

^Qx°(l - x)^P/"'^^(l - 2xo)P/°'^Hl - 2a;) exp --/(2P + l-l)t \ . (26) 
i=o I 2 J 

The constant c/ is given by 

_ r(2P + /-i)r(/ + i) 

- r(2m, + 0r(2m2 + /)^^^ + " 
while a = 2mi — 1, f3 = 2m2 — 1 and the parameter R = mi + m2. 

The P^^'^^ are Jacobi polynomials [46]. Unlike the case without mutations, 
a ^ (3, and they cannot be written in terms of simpler polynomials as in 
case (i). The function (26) satisfies both the initial and boundary conditions 
and holds for all x in the interval [0, 1], that is, now including the boundaries 
at x = and x = 1. This difference is due to the nature of the boundary 
conditions in the mutation case, which will be described in detail in Section 
3. 



For not too small values of i, the solution is well approximated by keeping only 
the first few terms in /. In these cases the polynomials do have a relatively 
simple form, since Pi"'^\z) is a polynomial in z of order /. For instance, 

Pt^\z) = 1 , Pt^^\z) = ^[(a + P + 2)z + a-P],.... (27) 

Also note that once again the natural variable for these polynomials is z = 
1 — 2x, so that they are defined on the interval — 1 < 2; < 1. In terms of the 
allele ai frequency x and the mutation rates mi and m2, these polynomials 
take the form 

Po^""^-'''"'^-')(l - 2x) = 1 , p(2-i-i.2™2-i)(^ _ ^ 2[mi -Rx],.... (28) 



For concreteness, we plot in Figs. 3 and 4 the time evolution pdfs for biallelic 
(M = 2) systems from a fixed initial condition of Xq = 0.7, i.e., that 70% of the 
population have one particular allele type. In the former case, the mutation 
rates are low (mi = m2 = 0.2) and one finds a high probability of one allele 
dominating the population. Conversely, in Fig. 4 which has higher mutation 
rates mi = m2 = 0.8, the most probable outcome is for both alleles to coexist 
in the population, as indicated by the peak of the distribution occurring for 
some value of x far from either boundary. In each of the figures, we compare 
the distribution obtained from a Monte Carlo simulation of the population 
dynamics against the analytic result (26) with good agreement. 
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Fig. 3. Time evolution of the pdf for a biallelic system as determined from the an- 
alytic solution (left) and Monte Carlo simulation (right) with mutation parameters 
nil = 1712 = 0.2 and initial condition xq = 0.7. 





Fig. 4. Time evolution of the pdf for a biallelic systems as determined from the an- 
alytic solution (left) and Monte Carlo simulation (right) with mutation parameters 
mi = 1712 = 0.8 and initial condition xq = 0.7. 

(iv) General M, with mutations. For general M, the solution may be written 
in the form 

Viu, t\uo, 0)=Y^ exim)^xiu)e-^' . (29) 

A 

as discussed in Section 3. Here Oa(m) is the left-eigenf unction of the Kol- 
mogorov equation ($a(m) being the right-eigenfunction, as usual). Again, A 
depends on a set of M — 1 non-negative integers li, I2, ■ ■ ■ , h'l-i, with 

A = -Zm-1 {2Ri + - 1) , L, = J2^J^ Ri = J2m,. (30) 

j=l i=l 

The left-eigenfunction Qx{uq) is equal to $(Mo)/-Pst(Mo) [40], where Pstiuo) is 
the stationary pdf. This is equivalent to the form (19), appropriate when there 
are no mutations, since {Pst{Uo))~^ is what is there called the weight function 
(up to normalisation). Since the problem is separable in this coordinate system. 
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$ may again be written as 

M-l 

<l>A(ni, n2, . . . , UM-i) = n ^^'H^O • (31) 

i=l 

Once again each of the factors t/'^*'' is separately a solution of a single dif- 
ferential equation, and depends on Wj, the integers Im-i and Lm-i-i and the 
mutation rates mj. Specifically, 

^^'H^O = cfj_%r (1 - uO^^-^"-^ Pztf Hi - 2u,) , (32) 

where 



^Right 



^ \ r(2m,)r(2i?,+i: 



{21 



a, + /3i + l)T{lM^i + l)r(/A/-i + + A + 1) 



^(/A/-^ + «^ + l)^(/A/-. + A + l) 



a,- = 2m,- 



A — 2-Ri+i + 2LA/_j_i — 1 



M 



(33) 

(34) 
(35) 

(36) 



In a similar way, Qx may be written as 

ex{u,,U2,...,UM^i)= n ^^'Hwi)- (37) 

i=l 

Each depends on Ui and on the integer Lu-i-i and the mutation constants 
Ri and mj. Specifically, 

^^^n«.) = (1 - u.f''-^ Ptf^ - 2^^) . (38) 

where 



„Left 



T{2mi)T{2R, 



' \ T{2Ri 



X 



{2lM-^ + a^ + Pi + l))T{lM-i + l)T{lM-i + Uj + + I) 
T{lM-i + ai + l)T{lM-^ + A + 1) 



(39) 



The general solution has previously only been found for the case M = 2. Our 
solution agrees with the published result [12] in this case. 
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2-4 Evolution of the population 



We can calculate the time evolution of various statistics of the population. 
As previously, we present here a summary of the results for ease of reference, 
deferring detailed calculations to Section 4. 



2.4.1 Fixation and extinction 

In the absence of mutation, alleles can become extinct since, once they have 
vanished from the population, there is no mechanism by which they can be 
replaced. As we have already mentioned, the solution we have derived (19) 
gives the pdf for all states in the interior. The pdf on a boundary is also given 
by this equation, so long as this boundary is not the point of fixation of the 
final allele, since it corresponds to the solution with M reduced by 1. So, for 
example, one can read off the pdf in, say, the two-dimensional subspace where 
alleles ai and 02 coexist and all others present in the initial population have 
become extinct since it is just the M = 2 solution. Kimura [13] finds the large- 
time form for this quantity given an initial condition comprising M = 3 alleles 
using an approach involving moments of the distribution which, as we shall 
show below, can be found directly from the Kolmogorov equation without 
recourse to the full solution we have obtained here. 

It remains then to determine the fixation probabilities that are excluded from 
the pdf we have derived. Using a simple argument (outlined below) one can find 
the probability that a particular allele has fixed by time t by reduction to an 
equivalent two-allele description, the properties of which are well-known [13]. 
Using the backward Kolmogorov equation (9), one can find further quantities 
relating to extinction events. Although some of the formulae we quote have 
appeared in the literature before [18,25] they seem not to be widely known 
and some were stated without proof. We therefore feel there is some value 
in reviving them here and giving what is hopefully a clear derivation in this 
section or in section 4. 

Fixation. Define /i(xo,t) as the probability that allele ai, which had initial 
frequency xq in a two allele system, has become fixed by time t. This is given 
by [13] 

-| 00 

/i(xo, t) = Xo-- im - 2xo) - Pz+2(1 - 2xo)] e-('+^)('+2)*/2 . (40) 

^ 1=0 

Here Pi{z) is a Legendre polynomial which is a particular case of the Jacobi 
polynomials: Pi{z) = Pi^'^\z). Conversely, the probability that ai has become 
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extinct by time t is equal to that of 02 become fixed, viz, 



/2(xo, t) = (1 - xo) - - E W - 2x0) - Pi+2il - 2xo)] e 



il+l)(l+2)t/2 



(41) 



Now, consider an initial condition with M > 2 alleles, divided into two groups 
X and Y. Let now x be the sum of the frequencies of X alleles. We may think 
of the set of alleles X as a single allele in a two allele system (for instance, ai in 
the above discussion) and the set Y as the other allele. Then the probability 
that all of the Y alleles (and possibly some of the X alleles) have become 
extinct by time t is simply /i(xo, t) where xq is the initial combined frequency 
of X alleles. Similarly, the probability that all of the X alleles are extinct at 
time t, leaving some combination of Y alleles is f2{xo,t). Arguments of this 
kind, where a set of alleles is identified with a single allele in the M = 2 
problem, can be frequently employed to obtain results for the general case of 
M alleles in terms of those already explicitly calculated for M = 2. 

Coexistence probability. By combining the above reduction to an equivalent 
two-allele problem with combinatorial arguments, the probability that exactly 
r alleles coexist at time t can be calculated [18]. The result is 



where r can take any value from 1 up to M, and the second summation is 
over all subsets of size s drawn from the M alleles. It is a simple matter to 
use this formula to calculate the mean number of coexisting alleles at time t. 

In Fig. 5, we compare the time-dependent probabilities for 1 < r < 4 alleles 
to be present in a system initially comprising M = 4 alleles given by the exact 
formula (42) and as obtained from a Monte-Carlo simulation with the allele 
frequencies initially equal. We also show the evolution of the mean number of 
coexisting alleles from starts with M = 3 and M = 4 alleles. In both cases, 
the exact formula correspond extremely well with the simulations, illustrating 
the utility of the diffusion approximation to the discrete population dynamics. 

Mean time to the rth extinction. By solving the appropriate backward Kol- 
mogorov equation, one can straightforwardly find the mean time to the first 
(and only) extinction event among two alleles to be 



A combinatorial analysis, performed in [18] and which we will summarise in 
Section 4.1, reveals the mean time to the r*^ extinction can be calculated from 




(42) 



r{xo) 



2 [xo In xo + (1 - Xq) ln(l - Xq)] . 



(43) 
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Fig. 5. Probabilities that exactly r alleles are present as a function of time for 
r = 1,2,3,4 in a system with 4 alleles initially. Expected number of alleles present 
as a function of time for systems with 3 and 4 alleles initially present (right). Symbols 
are Monte Carlo simulation data, solid curves are analytic calculations. 

an M = 2 result. One finds 

^^"^ /s - 1\ 

Tr{M) = -2j2 (-1)'"' _ J E[(^n,0 + • • • + Xi„o) ln(x,,,0 + . . . + Xi^,o)] , 
s=r Y -'-/ 

(44) 

where Xi^^^Q is the initial fraction of allele present in the system. The second 
sum is over all possible s-subsets {ii,i2, ■ ■ ■ ,is} of the M alleles initially present 
in the population. 



Probability of a particular sequence of extinctions. Not all extinction proba- 
bilities can be calculated by reduction to an equivalent a two-allele problem. 
A notable example is the probability that allele goes extinct first, followed 
by ajj, aj3, . . . , aij^j_-^ leaving only aj^,^ in the final population, since in this case 
we ask for all, rather than just some of a subset to be present in the population 
at a given time. This probability is given by 

O ■ ■ (x)= X- ^H/-2,o 



XlAlfi 1 Xij^jfi Xij^j_-^fi 

Xi2,0 



XiM,0 •^iA/-i,o ' ' ' -^iSjO 



(45) 



in which Xi^Q is the initial frequency of allele a, in the population. This result 
appears in [19], albeit without an explicit derivation. We shall provide one in 
Section 4.1. 



First-passage probability. An immediate consequence of the previous result 
is the probability that a particular allele, at is the first to become extinct, 
that is, at a time when all other alleles have a nonzero frequency. This is 
obtained by summing (45) over all possible sequences of extinctions in which 
Oj goes extinct first. For example, when the initial number of alleles M = 3, 
the probability that ai goes extinct first can be found by summing over the 
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cases (ii, i2, is) = (1, 2, 3), (1, 3, 2) in (45). One finds 



Qi{xi o, X2fl) — X2,oz ■ 1- ^3,0:; ' , 

1 - X2,0 1 - X3,o 

where x^^ = 1 — xi^o — X2,o- This agrees with the result in [18]. By a similar 
method, one could determine the probability for allele Oj to be the second to 
go extinct or any other quantity of this type. 

2.4-2 Stationary distribution 

When mutations are absent, we see from (40) that allele fixes with a prob- 
ability equal to its initial frequency in the population. Hence in this case, the 
stationary distribution 

P*{x) = lim P(x,t|xo,0) (46) 

t — ^oo 

is zero everywhere except at those points that correspond to fixation of a single 
allele. 

On the other hand, when all the mutation rates mj at which all alleles aj,j ^ i 
mutate to are nonzero, the stationary distribution is nonzero everywhere. 
Furthermore, this distribution is reached from any initial condition and takes 
the well-known form [4,38] 

M 2mi-l 

^•W^^'^^'Hr^. (47) 

where in keeping with the 2 allele case R = Ri = Y^fLi'^'j- Recall that the 
frequency xm is implied through the normalisation Y^fLi Xi = 1. 



2.4-3 Moments and Heterozygosity 

One way to calculate moments of the distribution is to integrate the exact 
solution we have obtained above. However, it is also possible to derive differ- 
ential equations for the moments in terms of lower-order moments from the 
Kolmogorov equation, a procedure that leads more quickly to simple, exact 
expressions for low-order moments. We quote some examples for general M 
here, noting that previously results for M = 2 and 3 without mutations have 
previously appeared in [13,11] and for M = 2 with mutations in [47,38]. It is 
also worth remarking that by calculating moments, Kimura reconstructed the 
solution for the pdf for the case M = 3 and no mutations [13]. 

No mutations. When mutations are absent, the mean of Xi is conserved by the 
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dynamics, i.e., 

{Xi{t)) = Xi,o • 

Meanwhile, the variance changes with time as 



(48) 



= a;,,o(l-a;,,o)(l-e-*) . (49) 

This result is clear, at least in the case M = 2. The delta-function initial 
condition has zero variance, and the stationary distribution comprises delta- 
functions at X = and x = 1 with weights (1 — Xq) and Xq respectively. For the 
latter one then has (x^) = (1 — xq) ■ + xq ■ 1 = xq and so ((x^)) = xo(l — xq). 

The covariance, which does not appear in the two allele system is: 

{{xiXj)) = (xiXj) - {xi{t)){xj{t)) = XifiXj^o{e~^ - 1) . 



With mutations. Crow and Kimura [47] give moments only for the biallelic 
case, and then in terms of hypergeometric functions. We provide simpler re- 
sults in explicit form here, again for any M. The mean frequency of any allele 
behaves as 

{xM = ^ + (x,o - ^) e-«* = v. + Oe-^* , (50) 

where 

rrii rui 
Vi = Ci = Xi,o - . (51) 

The mean (50) exponentially approaches the fixed point of the determinis- 
tic motion. Recalling that R = Y^iTrii, we see in Eq. (5) the function ^^(x) 
vanishes at the point x = rrii/R Vi. 

The variance is more complicated: 

//T-^r+^W - ''^'^^ ~ _L 0(1 - '^Vi) m _ ^2 -2Rt 

{{Xi[t))) - ^2i?+i) + (i? + i) ^ 

rr^(lj-^ 0(lj2M A,-{2R+i)t 
l(2i? + l) + (R+l) ^^/' • 

As t — > oo, this converges to the limit 

M_.^.-|Z^. ,53, 



In Fig. 6 we show the mean and variance as a function of time in the biallelic 
case, for three different pairs of mutation parameters. The mean approaches 
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Fig. 6. Solid lines: Time evolution of the mean (left) and variance (right) for a bial- 
lelic system for three different mutation parameter combinations, and with xq = 0.8. 
Dashed lines: Evolution of mean and variance when mutation is absent. 




m^=0.7 m^=0.7 



m^=0.2 m^=0.7 



m^=0.1 m,=0.1 



Fig. 7. The time development of the heterozygosity with two alleles, for the case 
with no mutations (dashed curve) and for three different cases of mutation (solid 
curves), xq = 0.8 in each case. 

the stationary distribution value exponentially, at a faster rate the larger the 
values of the mutation parameters. When mi = m2, the variance rises asymp- 
totically to the stationary limit, again faster when the mutation parameters 
are larger. Notice that for small values of mi and m2, the final variance is 
close to that when no mutations occur, while the larger the mutation param- 
eters become, the narrower the final distribution. When the parameters are 
unequal, the variance can reach a maximum value before descending slowly to 
the final limit. 



The covariance (with i ^ j) is 



{{XiXj)) — {XiXj) {Xi){Xj) 

ViVj _ iViCj + VjCi 
{2R + 1) 

, [ Vim 

\{2R+l) ' {R+l) 



{R+l) 

iviCj + VjCi 



(54) 
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Heterozygosity. As asserted by Kimura [13], in the absence of mutation the 
heterozygosity decays as e~* regardless of the initial number of alleles. The 
heterozygosity is the probability that two alleles chosen at random from the 
population will be different. When there are multiple alleles, this probability 
can be calculated from the first and second moments using 

H{t) = Y.m^^) - i^l)) - T^i^^x,)] . (55) 

When expressions for the various moments are substituted into this expression, 
it is found that, in the absence of mutations, the constant terms cancel, leaving 
only terms decaying as e~*. Therefore we have 

H{xo,t) = Hoe-' , (56) 

where we find Hq = 2J2i{xi^o — x^q) — J^iJ^j^ti^ifiXj^. Kimura derived this 
result from the expected change in H with each generation. 

The calculation with mutation present, however, does not give such a simple 
result. The heterozygosity can also be found for any number of alleles when 
mutation is present. However, the form becomes increasingly complicated as 
M increases. Therefore we give as an example only the two allele result here: 



H{xo,t)=2{x{l-x)) 
4mim2 
~ R{2R+1) ' 



fm2 - 








1 / 


- -r) 



Xn 



2mi 



R 



Xo 



mi 



2R+1 



-(2_R+l)t 



(57) 



This is plotted in Fig. 7. We see that the presence of mutation between the 
two alleles maintains the heterozygosity at a finite level. 



3 Solution of the Kolmogorov equation 



In this section we will derive the solution of the Kolmogorov equation (12), 
which was given in Section 2.3. In particular we will derive the differential 
equation that is satisfied by every single one of the factors ip^''\ui) that appear 
in the eigenfunction $a(m) in Eq- (22) which applies when no mutations occur. 
We shall also present the more general differential equation that must be solved 
to determine the corresponding factors in (31) that give the eigenf unctions 
when mutations are present. The more technical points of the discussion are 
relegated to Appendices B and C, in order to avoid interrupting the flow of 
the arguments. 
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3. 1 Change of variables 



It is worthwhile to recall briefly the derivation of the Kolmogorov equation 
itself, since it will turn out that the most efficient way of changing from the x 
variables to the u variables is go back to an early stage of this derivation. The 
starting point for the derivation is the Chapman-Kolmogorov equation which 
is one of the defining equations for Markov processes: 

P{x, t'laio, 0) = y" dx'P(x, t) P{x, t|xo, 0) . (58) 

If we take t' = t + At and expand 

P{x,t + At\x',t) (59) 

in powers of Axi{t) = Xi(t + At) — Xi{t), we find — in the limit At —>■ — the 
Kramers-Moyal expansion for P{x,t\xjQ,0) [40]: 

if =11 . \ {c^n...^M)P} . (60) 

ot ml dxi^ . . . dxi^^ 

Here the ajj...j^(x) are the jump moments defined through the relation 

{Ax,,{t)Ax,,it) . . . Ax,„(t))^(,)^^ = (x) At + 0{Atf . (61) 

The angle brackets are averages over realisations of the process x{t)] x{t) is a 
stochastic process, whereas Xg and x are simply the initial and final states. 

So far very little has been assumed other than the Markov property of the 
stochastic process. The Markov nature of the process follows from the fact 
that the sampling of gametes from the gamete pool depends only on its current 
state, and not on its past history. If we assume that no deterministic forces 
are present, only the random mating process, then (Axi-^^it)) x{t)=x = 0. If 
deterministic forces are present, then there will be an 0{At) contribution to 
{Axi-^(t))x(t)=x which is equal to (dxi^/dt)|dct At. Here (da;i^/dt)|det is the rate 
of change of a;,^ in the deterministic (A^ oo) limit. The higher jump moments 
reflect the statistics of the sampling process. In the case of two alleles this is 
binomial and so the second jump moment is proportional to x(l — x), which 
is the variance of this distribution. Within the diffusion approximation, a unit 
of time is taken to be 2N generations, so that the time between generations 
is At = 1/{2N). This implies that all jump moments higher than the second 
are higher order in At. This gives the results (5) and (6). For the case of 
general M, the multinomial distribution applies, which again means that the 
<^ii...im{si) m > 2 vanish, and the second order jump moment is given by 
(4), up to a factor of 2. 
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This is the briefest summary of the derivation of Eq. (8). We refer the reader 
to [40] for a fuller account. However, the above discussion is sufficient for our 
purposes, which is to note that the derivation involving Eqs. (58)-(61) may 
be carried out for any coordinate system, and specifically in the coordinate 
system u = {ui, U2, ■ ■ ■ , um-i) for which the Kolmogorov equation is separable. 
In this case the Kramers-Moyal expansion for 'P{u,t\uQ,Q) is 

where are the jump moments in the new coordinate system: 

(An,,(t)An,,(t) . . . An,^(t))„(,)^„ = a,,...,^{u)^t + 0{^tf . (63) 

The most straightforward way of deriving the Kolmogorov equation in the 
u variables is to start from Eq. (62) and determine the ai-^,,,i^{u) by relating 
them to the known Q;ij...i„^(x). This is carried out in Appendix B and the result 
is the equations (12) and (13). 



3.2 Separation of variables 



The next step in the solution of the Kolmogorov equation is to separate out 
the time and Ui coordinates. That is, we write the general solution as a linear 
combination of such solutions: 

V{u, t\uo. 0) = E «A(Mo)$A(li)e-^* , (64) 

A 

where the function $a(w) satisfies 

E ^J,_,^^ I- {(^^-^ - -0 + K(i - -0]) = -a$a(m) 

(65) 

and Ri = EfLirrij. We may apply initial conditions directly on (64): 

S{u-UQ) = '^ax{uQ)^x{u). (66) 

A 

By assuming the orthogonality relation 

J duw{u)^xiu)^x'iu) = 5 XX' , (67) 

where w{u) is an appropriate weight function, we then have that ax{u) = 
w(m)$a(m) and so 

V{u,t\uQ,0) = w{uo)Y,^x{uo)^x{u)e-^'- (68) 

A 



22 



To determine $a(w) we have in principle to solve the partial differential equa- 
tion (65) in n = M — 1 variables. We will now prove that this equation is 
separable in the u variables. For clarity we will start where mutation does not 
take place (that is, omit the factor {RiUi — rrii) in (65)) — the argument goes 
through in an identical fashion when this term is present, as we will discuss 
later. 

When n = 1 (M = 2) $a(i^) satisfies the equation 

1 d2 



2dnf 



u 



i(l - ni)$;,(i) (m)] = -AW$;,(i) (m) , (69) 



where we have introduced the superscript (1) on to the eigenvalue A to identify 
it as belonging to the n = 1 problem. This ordinary differential equation may 
be solved in a straightforward fashion as we discuss in the next subsection 
below. 

When n = 2 (M = 3) $a(w) satisfies the equation 

1 d 
11 d 

o ~^ TO ["2(1 - U2)^x(2) {ui, U2)] = -A(2)<I';,(2) (mi, U2) . (70) 

Z 1 — ZLi OIX2 

We look for a separable solution of the form $;)^(2) (ui, ^2) = ip{ui)(j)-x(i){u2) 
where 

1 d^ 

2 duf ^"^''^ ~ «2)0A(i)(m2)] = -A^^Va{1)(m2) , 

that is, (pxwiui) satisfies the equation corresponding to the n = 1 problem. 
It is straightforward to show that a separable solution exists if ip{ui) satisfies 
the equation 



1 d 



2 



2dM? 



Ml(l - Ui)^x(2).x(i){Ui)\ = j-A^^) + -^Y"!^} ^A(2);A(i)(Mi) • (71) 



We have introduced the subscript A*-^-*; A*^^-* on ^{ui) to show that it depends 
on both the eigenvalues of the n = 2 and the n = 1 problems. As we will see, 
it is a remarkable fact that the solution of the problem with M = n+1 alleles 
only involves the function if) appearing in Eq. (71). Since 0a(i)(^2) may also 
be written as ipxw-Q we may write the separable form of the solution in the 
n = 2 case as 

$A(2) (Wl, W2) = ^A(2);A(i) (Mi)V^A(i);o(w2) • (72) 

We can now prove that the general case of M = n + 1 alleles is separable as 
follows. We look for a solution of (65) of the form 

$;^(n) (Ul, U2, • • • , Un) = tlj{Ui)^x("-i) {U2, . . . , U„,) , 
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where ^x(.n-i){u2, ■ ■ ■ ,Un) is the solution of (65) with n — 1 alleles, but with 
the replacements Ui U2,U2 M3, . . . ,m„_i — ^ m„,. By explicit substitution 
into (65) it is not difficult to demonstrate this, and to find that ip{ui) satisfies 
Eq. (71), where now A*^^-* and A'-^-* are replaced by A'-"^ and A'-""^-* respectively. 
Therefore we have shown that 

^Xi") {Ul, M2, . . . , Un) = t/jxM;X(r^~i) (ui)$A("-i) (^2, • • • , U„) • (73) 

This shows that if the n — 1 allele problem is separable, then the n allele prob- 
lem is separable. But Eq. (72) shows that the n = 2 problem is separable, and 
so by induction the problem for general n is separable. The explicit solution 
is 

n 

^XM (Mi, M2, . . . , Mn) = n ^A("+i-');A("-') i'^i) ) (74) 

4 = 1 

where A'-"^ = and where ipX]X'{u) satisfies the equation 

l^Hl- u)^X;y{u)] = |-A + (Y^l ^A;A'(«) • (75) 

Thus we have finally shown that in order to obtain the eigenfunction "^x^") ("^i; • • 
it is necessary only to solve this single ordinary differential equation for gen- 
eral A and A'. The full eigenfunction is then just a product of solutions of this 
equation. 

A similar equation arises from an analogous argument when mutation is in- 
cluded, that is, the factor {RiUi — nii) is restored into Eq. (65): all that is 
required is to make the substitution 

Id Id 

in the above argument. The only difference is that now the function explicitly 
depends on i because of the existence of the terms Ri and mj. Thus the results 
corresponding to Eqs. (74) and (75) are: 

n 

$A(n)(Ul, U2, . . . , U„) = ^/'Ji + i_,).^(„_i)(Ui) , (76) 

i=l 

where V'a a'(^«) satisfies the equation 



[RiUi - mi) ipx\'{'^ 



1 d^ 



2dn2 



Ui{l - Ui)4j'^^x'iUi) 

X' 



Ui 



(77) 
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The rest of this section will be devoted to solving the ordinary differential 
equations (75) and (77) subject to the boundary conditions of the problem. 



3.3 Solution of the ordinary differential equations 



3.3.1 Reduction to a standard form 

To solve the ordinary differential equation (77) (which includes the less general 
Eq. (75) as a special case) it is useful to cast it in a standard form. To this 
end, we introduce the function fx\'{ui) by writing 

V'gk^o = (1 - , (78) 

where 7, is a constant which is to be appropriately chosen shortly. Substituting 
Eq. (78) into Eq. (77) one finds (dropping the subscripts and superscripts on 
/) 

n,(l - u,)f" + 2 [(1 - m,) + {R, - (7, + 2)n,] /' 

+ [2A - (7i + + 2) + 2i?,(7, + 1)] / 

[2A'-7,(7, + l) -27,(m,-i?,)] 



Ui 



/• (79) 



We will choose 7, so that the term involving (1 — Uj) in the denominator 
vanishes, that is, we ask that 

7i (7i + 1 + 2m, - 2R,) = 2X' . (80) 

This is a quadratic equation for 7,, but it is simple to see that if 7, is one 
solution, the second one is simply 7,' = 2Ri — 2mj — 1 — 7,. It will turn out 
that either choice leads to the same form for 'ipx-^yi'^i)- 

If we choose 7, to satisfy Eq. (80), then Eq. (79) has the form 

n,(l - u,)f" + [c-{a + b+ l)n,] f - abf = , (81) 

where 

c = 2(l-m,); a+b = 27,+3-2i?,; ab = (7,+l)(7i+2)-2i?,(7,+l)-2A . (82) 

The solution of the equation has now been reduced to a standard form, since 
Eq. (81) is the hypergeometric equation [46], whose solutions are hypergeo- 
metric functions F{a,b,c, z). The details of the analysis of this equation are 
given in Appendix C; here we only describe the general features and specific 
form of the solution, without going into their derivation. That is, we find the 
constants 7, and A. In the following discussion it is convenient to treat the 
cases where mutations do and do not occur separately. 
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3.3.2 Solution in the absence of mutations 

We begin with the situation where mutations are absent, and so the pdf is 
found by solving Eq. (81), but where now 

c = 2; a + b = 2-fi + 3; a6 = (7, + 1)(7, + 2) - 2A . (83) 

The 7j are chosen so that 7i(7i + 1) = 2A'. We begin the discussion, for 
orientation, by assuming that only two alleles are present. This amounts to 
solving Eq. (69) or Eq. (75) with A' = 0, and the solution is therefore the 
function ■?/'a(i);o('"i) = (p\w{ui). This function also appears in the solution for 
arbitrary M, given by Eq. (74), and so this result is also required in the general 
case. 

Since in this simple case of two alleles A' = 0, it follows that 71 = or 
7i = —1. As discussed in Appendix C, either choice gives the same solution, 
and we therefore take 71 = 0. We have now only to solve the second order 
ordinary differential equation (81) subject to boundary conditions at Mi = 
and Ml = 1. As is familiar in such problems [48], the general solution is first 
expressed as a linear combination of two independent solutions. One of these 
is ruled out by one of the boundary conditions (e.g. the one at ui = 0) and the 
application of the other boundary condition constrains a function of A to be 
an integer; in our case one finds a = —I, / = 0, 1, . . .. Details are in Appendix 
C, but we do wish to mention the nature of the boundary conditions here. 
Technically [42] these are exit boundary conditions which means that all the 
probability which diffuses to the boundary is "extracted" from the interval 
(0, 1). Naively, one might imagine that it is appropriate to impose absorbing 
boundary conditions, i.e., that the pdf vanishes at mi = and mi = 1. However, 
since the diffusion coefficient, defined in Eq. (1), vanishes on the boundaries 
the diffusion equation itself imposes some sort of absorption. A more careful 
mathematical analysis, such as the informal argument presented in Appendix 
A, reveals that the appropriate constraint on the eigenfunctions is that they 
should diverge less rapidly than a square-root at both boundaries. 

The result a = —I imposed by the boundary conditions together with the 
definitions (83) imply that 6 = Z + 3, a6 = —P — 31 and so 

A = ^(/ + l)(/ + 2) . (84) 

The solutions of the differential equation will be labelled by the integer I and 
consist of polynomials of order /. These have already been mentioned in Section 
2 (Eq. (17)), whilst more technical information can be found in Appendix C. 

Now we can move on to the solution for general M. As explained above it is 
sufficient to solve Eq. (75) where A' is given by the A found in the solution 
of the M — 1 allele problem. We have seen that for M = 2, this is given by 
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(84), and so this will be the A' for the M = 3 problem. We will see that this 
will have the general structure {L + 1)(L + 2)/2, where L is an integer, and 
therefore we take this form for A'. From the way that the M — 1 problem 
is embedded in the M allele problem, we always have only to solve the ui 
equation and so we need to determine 71. This is determined by the equation 
71(71 + 1) = 2A' from which we see that 71 = L + 1 or 71 = —L — 2. Again, 
both give the same result and so we take the former. Applying the boundary 
conditions again gives a = —I, with / = 0, 1, . . . — see Appendix C for the 
details. Using Eq. (83) one sees that b = 2L + I + 5 and so 

A = ^ (L + / + 2) (L + / + 3) = ^ (L' + 1) (L' + 2) , (85) 
where L' = L + I + 1. This justifies the choice for A'. 

In summary, we have found that A^^^ = {h + l)(/i + 2)/2, A^^^ = {h + h + 
2){h + /2 + 3)/2, . . .. If we define 

U^Y. + 1) , (86) 

then A*^'^ = Li[Li + l)/2, and we have recovered (21). The general solution is 
given by Eqs. (19)-(25) and discussed further in Appendix C. 



3.3.3 Solution with mutations 

When mutations are present we have to solve Eq. (81) subject to the general 
set of parameter values given by Eq. (82). While it is true that this slightly 
complicates the analysis as compared with that of the mutation-free case, 
the most significant change is the nature of the boundary conditions. The 
introduction of mutations in the way we have done here renders the boundaries 
reflecting, which are defined as having no probability current through them. 
The current Ji{u,t) satisfies the continuity equation [40,41] 

o-p M-l O T 

§-5:^-0. (87) 

1=1 ' 

where (compare with Eqs. (12) and (13)), 



Ji{u, t) = [Mu)V{u, t)] - — [Vi{u)V{u, t)] 

- n-(lr^ {k - V - 1 A K(i - um] . (88) 



27 



Using the separable form of the solution (76), and asking that the current is 
zero on the boundaries lead to the conditions 



or in terms of the solutions of the hypergeometric equation 



0, (89) 



Ui=0,l 



{[2 (RiU, - m,) + {1 - (7, + 2)u,}] (1 - u.^ fxk^M^ + 

=0- (90) 

* ) Ui=0,l 

As before, we will discuss the general aspect of the solution here, deferring 
technical details to Appendix C. We begin with the case of two alleles. 

The solution to the Kolmogorov equation when only two alleles are present 
has been given by Crow and Kimura [47,11] and in our case corresponds to 
the solution of Eq. (77) with A' = 0. It then follows from Eq. (80) that we 
may take 7i = (note that i takes on the value 1 in this case, since there is 
only one variable in the problem: Ui = x). It therefore follows from Eq. (82) 
that c = 2(1 - mi), a + b = 3 - 2Ri and ab = 2(1 - Ri - A). We begin by 
examining the nature of the solutions of Eq. (81) in the vicinity of ui = 0. 
Three separate cases have to be examined: (i) c not an integer, (ii) c = 1, 
and (iii) c = 0, —1, —2, . . .. When the boundary condition (90) is applied, one 
of the two independent solutions is ruled out. The application of the other 
boundary condition again imposes a condition on a. In Appendix C, we show 
that 

A = ^/(2i?i + /-l) , (91) 

where / is a non- negative integer and, as M = 2, i?i = mi + m2. The solutions 
of the differential equations are again labelled by an integer /, and once again 
turn out to be Jacobi Polynomials which have been given in Section 2. 

When determining the solution for general M, the eigenvalue A' in Eq. (77) is 
given by the eigenvalue A found in the M — 1 allele problem, as occurred in the 
special case of no mutations. However, more care is required here, since the 
iterative nature of the problem manifests itself in such a way that the solution 
with M alleles is determined in terms of a function ip{ui) and the solution 
with M — 1 alleles but with Uj Ui^i,mi mj+i and Ri Ri+i, where 
i = 1,2 . . . , M — 1. Therefore although A^^'' is given by Eq. (91), the A' we use 
for the M = 3, (n = 2) problem is actually A' = /i(2i?2 + h - l)/2. For the 
general case we will see the A' will have the general structure 

A' = h {2R2 + L-1) , (92) 
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where L is an integer. So having obtained the solution for M2, . . . , um from the 
M — 1 solution, we need to solve the i = 1 equation (77) where 71 is a solution 
of Eq. (80), that is, 

71 (71 + 1 + 2mi - 2Ri) = L {2R2 + L-1) . (93) 

This equation has two solutions: 71 = —L and 71 = 2R2 — I—L, and since both 
give the same result we take the former. The implementation of the boundary 
conditions is carried out in the same way as in the two allele case. This is 
discussed in Appendix C where it is shown that this implies that 

A^'^) = hn (2i?i + L„ - 1) , (94) 

where 

i M 

U^Y.h^ Ri = Y.^,- (95) 

Here the Ij are non-negative integers. The solutions are given explicitly by 
equations (29)- (38). 



4 Derivation of other results 



In the study of most stochastic systems, the quantities which are of interest and 
which are most easily calculated are the mean and variance of the distribution 
and also the stationary pdf which the pdf tends to at large times. This is still 
true in the problems we are considering in this paper, where these quantities 
can, in general, be rather easily obtained. Many are already known and have 
been given in Section 2.4 and their derivation will be briefly discussed in this 
section. 

We draw particular attention to the added subtleties that exist when no mu- 
tation is allowed. In this case the exact solutions obtained so far only hold 
in the open interval Ui G (0, 1) and do not include the boundaries = or 
Mj = 1. It is clear that with increasing time various alleles will become extinct 
and the stationary pdf will be concentrated entirely on the boundaries where 
it will accumulate. This is somewhat different to the usual picture of absorbing 
states where the probability is removed entirely from the system. In order to 
calculate moments of allele frequencies, for example, one must take care to 
add in the contributions from the boundaries to those obtained by averaging 
the frequencies over the pdfs we have so far determined. This procedure will 
be demonstrated later in this section. First, however, we calculate a few quan- 
tities of interest when the absence of a mutation process admits the fixation 
and extinction of alleles. 
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4-1 Fixation and extinction 



Fixation. The probability that one allele has fixed by a time t was first obtained 
by Kimura [13] by way of a moment formula for the distribution. Here we 
derive this quantity directly from the explicit solution when there are only 
two alleles. We have described in Section 2.4.1 how this can be extended to 
any number of alleles. 

The definition of the current (88) can be written as 

Id 1 1 dP 

J{x, t) = --— [x{l - x)P{x, t)] = --{l- 2x)P{x, t) - -x{l -x)^ (96) 

in which the function P{x, t) is the probability distribution excluding any 
boundary contributions. We find then at the boundary points x = and 

x = 1, 

J(0,t) = -ip(0,t) and J(l,t) = ^P(l,t). (97) 

We then find the probability for allele ai to have fixed by time t is the sum of 
the current at the boundary: 

/i(xo,t)= f J{l,t)dt = \ f P{l,t)dt (98) 
JO 2, Jo 

which can be evaluated by inserting the explicit expression (17). This proce- 
dure yields 

A(xo,t) =a;o(l-xo)E^(-l)'^/'''Hl-2:^o){l-e-('+^)('+^^^^^ . (99) 
1=0 ' + 

Using the identities (C.13) and (C.14) given in Appendix C this can be written 

-| oo 

/i(xo, t) = xo-- im - 2xo) - P^+2(l - 2xo)] e-('+^)('+2)*/^ (100) 

^ 1=0 

where Pi{z) is a Legendre polynomial, in accordance with the result of [13]. 
The probability that ai has been eliminated by time t is the same as the 
probability that 02 (with initial proportion 1 — xq) has become fixed. In other 
words, 

/2(a;o,t) = /i(l-xo,t) (101) 
= (1 - xo) - ^ E iW - 2x0) - P.+2(l - 2xo)] e-('+i)('+2)t/^102) 
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where we have used the fact that Pi{—z) = {—iyPi{z). 

Coexistence probabihty. As noted in Section 2.4, combinatorial arguments can 
be used to calculate the probability that exactly r alleles coexist at time t from 
the fixation probability just derived. To do this, divide the complete set of M 
alleles into two groups X and Y, the first containing a particular subset of r 
alleles. As argued in Section 2.4, the probability that all of the M — r alleles 
in Y have become extinct by time t, leaving some of those in X remaining is 
just /i(xo,t), where Xq is the total initial frequency of all alleles in X. 

To find the probability that all the alleles in X continue to coexist at time t, 
we must subtract from fi{xo,t) the probability that one or more of them has 
been eliminated. For example the probability that only the pair of alleles 
and aj coexist at time t is 

Xihj) = fliXifl + Xjfi,t) - fliXifl,t) - /l(Xj,o,t) , 

as shown by Kimura [13]. One then finds the probability that exactly two 
alleles remain at time t to be the sum of the previous expression over all 
distinct pairs of indices This gives 

n2{xo,t) = E/iKo + x„o,t) - (M- l)5]/i(x,,o,t) . (103) 

i<j i 

Similarly, the probability that only the triple of alleles Oj^, and a^g all 
coexist at time t is the probability fi{xij^fi + Xi^^ + Xi^^, t) that some subset of 
these alleles remains at time t, minus the probability that only any particular 
pair or single allele from a^^, and a^g coexist, that is, 

X(^l,^2,«3) = fl{Xi^fi+Xi2fi+Xi^fi)-[fi{Xi^fi+Xi2fi,t)-fi{Xi^fi,t)-fi{Xi^fi,t)] 
~ [/l (2^12,0 + 3;i3,0) ^) — fl{Xi2,0^t) — fli^isfi^'t)] 
- [fl{Xi^,0 + Xi^fl,t) - fl{Xi^fi,t) - /i(Xi3,o,t)] 

- fi{xi^fi,t) - fiixi^fi^t) - fi{xi^fl,t) . (104) 

Simplifying, we find that 

3 

X(^i,^2,i3) = -E(-l)' E fiixn,o + --- + Xj^,o,t), (105) 
*=i jeo-s(n,«2,«3) 

where ag is a subset of ii,i2, is with s elements and ji, . . . ,js are the elements 
of that subset. Proceeding in this way, one finds that the probability that the 
r alleles a^^ , . . . , a^^ all coexist at time t is given by 

Xitu ...,tr) = E(-l)'-' E fii^nfi + ■■■ + ^js,o, t) , (106) 

■5=1 jeas(ii,...,ir) 
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where now the second summation is over a subset of ii, . . . , v with s elements. 
The result (106) can be proved by induction, by beginning with the expression 
for • • • ; V+i) (compare with Eq. (104) which is the case r = 2): 

r 

x(ii, . . . , V+i) = /i(3;ii,oH ^Xr+i,o,t)-J2 J2 X(ji, • • • ,Jfc) • (107) 

fc=i ie(Tfe(ii,...,v+i) 

If we now assume the result (106) up to and including r alleles, then by 
substituting these expressions for x{ji^---yjk) into the right-hand side of 
Eq. (107), we obtain the expression (106), but with r replaced by r + 1. 
Since we have explicitly verified the result for low values of r, the result is 
proved. During the course of the proof the double summation is rearranged 
using Ylk=i^s=i ~^ Yll=iYlk=s ^nd the fact that a particular set (/i, ■ ■ ■ Js) 
appears ('^^i^'^) ^inies in the sum is used. 

The analogue of Eq. (103) in this case, namely the probability that exactly r 
alleles remain at time t is the sum of Eq. (106) over all distinct r-tuplets of 
indices ii, . . . ,ir- 

^rixo,t)= X(«i,---,v)- (108) 

iG(7r{n,.--,«A/) 

Substituting the result (106) into this equation, and using exactly the same 
manipulations that were required in the inductive proof of (106), gives the 
general result for r alleles stated earlier: Eq. (42); see also [18]. 



Mean time to the rth extinction. As is well-known (see, e.g., [39,41]) the mean 
time t{xq) to reach any boundary from an initial position Xg 

>CV(xo) = -1 (109) 

where is the backward Kolmogorov operator appearing in (9). The bound- 
ary conditions on t{xq) are that it vanish for any Xq corresponding to a bound- 
ary point. For the case M = 2 and no mutations, we seek the solution of 



1 d^ 

-a;o(l - a;o)^r(a;o) = -1 (110) 



that has r(0) = r(l) = 0. Two successive integration steps yield the required 
answer (43). 

A related quantity, which is obtained in a similar fashion, is the mean time to 
fixation of allele Oi, given that it does become fixed. This is given by 

, j;rdtt^Ji(xo,t) , , 
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since /i(xo, t) is the probability that allele Oi has become fixed by time t. Since 
/i(xo,t) — > Xo as t — s> oo, the denominator equals Xq. To find the numerator, 
we multiply the backward Kolmogorov equation by t and integrate over all t. 
Use of Eqs. (97) and (98) then shows that the numerator obeys the equation 

1 f°° d 

-Xo(l-Xo)^y^ dtt—fi{xo,t) = -xo. (112) 
This yields the result 

r-(xo) = - '^^-^°^^"^^-^°^ (113) 

Xo 

To find the mean time to the rth extinction event from an initial condition 
with M alleles, note that the probability Q^{M,t) that m or fewer alleles 
coexist at a time t can be found by summing (42) appropriately. One finds 

m 

Q<{M,t) = J2nr{t\x^) 

r=l 

= E E(-i)^-^ ; _ J E /i(^.,o + ■ ■ ■ + x..,o, t) 

r=l s=l \r S J 

= E(-ir"1 _ W:/i(x,,o + --- + x,,o,t), (114) 
s=i \ m s J 

where we have rearranged the double summation using YIk=i Ss=i ~^ S^^i Sfc^s 
and also used the identity Ef=o(-l)*^Q = (-l)^^(^xO- 

Differentiating Q^{M,t) with respect to time gives the probability that a 
state comprising m alleles is entered during the time interval [t,t + dt]. This 
corresponds to the rth extinction event, where r = M — m. Hence the mean 
time to this extinction, the first moment of the distribution, is given by 

r.(M)= E(-l)'^"-(^_'_ijEi dtt^/,(x,.o + --- + x..,o,t), 

(115) 

the denominator being unity, since limt^oo Qm(^5 ^) — 1- However, from 
Eqs. (Ill) and (113), 

POO 

/ dtt-/i(xo,t) = -2(l-a;o)ln(l-xo) . (116) 
JO dt 

This gives an expression for the function which we wish to evaluate in Eq. (115). 
Changing the summation variable from s to M — s, and so identifying 1 — xq 
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with Xii^o H \- Xi^,o gives 



M-l _ 



Tr{M) = -2 E(-l)'"1^_ J K,o + --- + a;i,,o]ln[x,,,o + --- + a;i,,o] • 



;il7) 



Probability of a particular sequence of extinctions. Let Qm,m-i,... ,2(210) be the 
probability that in the evolution allele M becomes extinct first, followed by 
allele M — 1, M — 2 and so on, ending with fixation of allele 1. Littler [19] 
found the result given in Eq. (45), but we give a derivation here. We define 
QM,...,2(t\xj[), 0) as the probability that this sequence of extinctions has occurred 
by time t. So for example, in the biallelic system (M = 2), Q2{t\xo,0) = 
/i(xo, t). Just as one can show that /i obeys the M = 2 backward Kolmogorov 
equation (by setting x equal to its value on the boundary x = 1 and relating 
/i to P(l, t|xo, 0)), then in general one can show that QM,...,2(^|i£o! ^0) satisfies 
the backward Kolmogorov equation (9) 

|:QA/,...,2(t|Xo, 0) = /:^gAf,...,2(tko, 0) . (118) 



The function (5M,Af-i,... ,2(210) is the stationary (i.e., t — > 00) solution of this 
equation that satisfies the following boundary conditions. First, Qm,... ,2(2105 0) = 
for any Xq that corresponds to any allele other than M already being ex- 
tinct. That is, for any Xo that has Xi^ = for any i < M. On the hyperplane 
xm,o = 0, we must have that (5m,...,2(^0! 0) equal to the probability of the 
subsequent extinctions taking place in the desired order by time t. Taking the 
limit t ^ 00, this boundary condition implies 

Qm,... ,2(2:1,0, • • • , Xm-1,0, 0) = QM-l,M-2,...,2{xifl, . . . , Xm-1,o) ■ (US) 

We have chosen this particular order of extinctions to demonstrate the result as 
it corresponds to the ordering implied in our definition of u. In the u variables, 
the backward Kolmogorov operator reads 

We conjecture a solution 

Qm,M-1,...,2(^o) = UiflU2fl---UM-l,0 ■ (121) 

Clearly, £^ annihilates this product, so it is a stationary solution of (118). 
The boundary condition that Qm,m-i,. ..,2(210) = ^i,o = 0, i < M is also 
obviously satisfied. Furthermore, if xm,q = 0, 



XM-1,0 
1 _ Y-A/-2 



mm-1,0 = ... = 1 • (122) 
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Hence the recursion (119) is satisfied. It is easily established that (52(a^i,0) 2:2,0) = 
Xi by finding the stationary solution of the backward Fokker-Planck equation 
(118) with M = 2 and imposing the boundary conditions (^2(1,0) = 1 and 
(^2(0, 1) = 0. Therefore by induction, (121) is the solution required. Rewriting 
in terms of the x variables, 

^ / \ _ ^2,0 3:3,0 Xm-1,0 

VM,Af-l,...,2li^oJ — 3^1,0 



1 ^ 2:1^0 1 ~ ^1,0 ~ 2:2,0 1 — Xifi — X2,o — • ■ ■ — a:M_2,o 

(123) 

The probability for an arbitrary sequence of extinctions: allele ii to go extinct 
first, followed by 12, is, ■ ■ ■ ,iM-i leaving only allele iM can be determined by 
permuting the indices in (123). This then gives us (45). 



4-2 Stationary distribution 



We earlier obtained the time- dependent pdf valid when mutation rates are 
nonzero by imposing refiecting boundary conditions. These have the effect of 
preventing currents at the boundaries, which in turn ensure that the limiting 
t — >■ 00 solution is nontrivial and hence the complete stationary distribution. 
One can check that this is the case from (29)-(38). When all the integers k are 
zero, the eigenvalue in (30) is also zero, indicating a stationary solution. The 
remaining eigenvalues are all positive, which relate to exponentially decaying 
contributions to the pdf. Retaining then only the term with A = in (29) we 
find immediately that 

It is easy to check that this distribution is properly normalised over the hy- 
percube < Ui < 1, i = 1,...,M— 1. To change variable back to the al- 
lele frequencies Xi we use the transformation (16), and using the fact that 
Ri = rrii + Ri+i, Rm = ttT'M and xm = 1 — J2iii ^i, we arrive at the result 
quoted earlier, (47). 

When the mutation process is suppressed, one finds from (21) that all the 
eigenvalues are nonzero, indicating that the distribution we have derived is 
zero everywhere in the limit t 00. This means that the stationary solution 
comprises the accumulation of probability at boundary points, as stated in 
Section 2.4. 
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4-3 Moments of the distribution 



One way to obtain the moments is to perform averages over the distribution. 
The mean and variance when mutation is present can easily be calculated in 
this way from (26). Calculating the mean directly from the explicit formula for 
the probability distribution (17) when fixation can occur is tricky because one 
must include the full, time-dependent formula for the fixation probability at 
the right boundary and the integrals one has to evaluate are not particularly 
convenient. 

Alternatively, we can exploit the Kolmogorov equation written in the form 
of a continuity equation (87). This leads to a differential equation for each 
moment which depends on lower moments. The first few moments can then 
be found iteratively in a relatively straightforward way. 

We demonstrate the method by giving the derivation of the moments of the 
distribution when two alleles are initially present. We then show that this 
method extends in a straightforward way to the case when M alleles are 
present. 

Specifically for M = 2 we have 

dP{x,t) , dJjx.t) ^ 

~^r~ + - " ^^^^^ 

where the current 

1 d 

J(x, t) = (nil - Rx)P(x, t) - -7^x(l - x)P(x, t) (126) 

2 ox 

and where R = mi + m2. 

When mutations in either direction are present (i.e., both mi > and m2 > 0) 
the mean of x'' is given by the expression 

{x\t)) = f\^P(x,t\xo,0)dx (127) 
Jo 

and so 



d 

— / x''P(x,t\xo,0)dx = - / x'' — J(x,t)dx 

ot Jo Jo ox 



k I x^ V(x,t)dx- x^J{x,t) 



x=0 



(128) 



where we have used the continuity equation (125) and integrated by parts. 
We have already stated that there is no current of probability through the 
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boundaries. In other words, here 

J(0,t) = J(l,t) = (129) 
and so the last term in (128) is zero. 

When fixation is a possibihty, one does have a current at the boundaries and, 
the probabihty that allele ai is fixed at time t being /i(a;o, t). Since the function 
P(x,t|a;o,0) excludes contributions from the boundary in this case, we must 
add these explicitly into the mean of . That is, 

{x\t)) = ■ /2(xo, t) + /' x^P(x, t|xo, 0)dx + 1 ■ /i(xo, t) . (130) 

Jo 

Taking the derivative of this expression and carrying out the integration by 
parts as in (128), we obtain 

^ = * *)dx - [.'Jix.tML, + . (131) 

These last two terms cancel, and so in either case we are left with 

d{x^{t)) '•I 



dt 



k f x''~^J(x,t)dx . (132) 
Jo 



Inserting the expression (126) for the current we find 



g(x^-(t)) 
di 



k 



rrii + 



k-1 



{x''\t)) 



R 



k 



{x\t)) 



(133) 



This reveals that moments of the distribution can be calculated iteratively. 
This equation is valid whether or not the are non-zero. The equation for 
the mean [k = 1) can be solved directly, and the results used to find (x^) 
{k = 2) and so on. 

When there are more than 2 alleles, a similar derivation gives the equation for 
the general moment {xi^X2^ ...x^'"^): 



dt 



Y^kA[m, + -ih-l)]{x^^-'Y[x'/) 



[l{j:k-l) + R]{Ylxh^ . (134) 



Thus, again we can calculate any moment by iteration. For example (xiXj), 
with i j, obeys the equation 



djxjXj) 
dt 



mj{xi) + mi{xj) — {2R + l){xiXj) 



(135) 
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and using (50) we find the result given in (54). 



5 Discussion 

In the last decade or two the ideas and concepts of population genetics have 
been carried over to a number of different fields: optimisation [49], economics 
[50], language change [51], among others. While the essentials of the subject 
such as the existence of alleles, and their mutation, selection and drift are usu- 
ally present in these novel applications, other aspects may not have analogues 
in population genetics. Furthermore, phenomena such as epistasis, linkage dis- 
equilibrium, the relationship between phenotypes and genotypes, which form 
a large part of the subject of population genetics, may be unimportant or 
irrelevant in these applications. Our motivation for the work presented here 
has its roots in the mathematical modelling of a model of language change 
where it is quite plausible that the number of variants (alleles) is much larger 
than two. It was this which led us to systematically analyse the diffusion ap- 
proximation applied to multi-allehc processes, having noted that many of the 
treatments in the context of biological evolution tend to be couched in terms 
of a pair of alleles, the "wild type" and the "mutant" . 

In this paper we have shown how the Kolmogorov equation describing the 
stochastic process of genetic drift or the dynamics of mutation at a single 
locus may be solved exactly by separation of variables in which the number of 
alleles is arbitrary. The key to finding separable solutions is, of course, to find 
a transformation to a coordinate system where the equation is separable. The 
change of variable (10) we used is similar, but slightly different to one suggested 
by Kimura [12] which he showed achieved separability up to and including 4 
alleles. Kimura was of the opinion that novel mathematical techniques would 
be needed to proceed to the general case of M alleles. What we have shown 
here is that with our change of variables this generalisation is possible without 
the need to invoke any more mathematical machinery than was required in 
the standard textbook case of 2 alleles. A large part of the reason for this 
is the way that the problem with (M -|- 1) alleles can be constructed in a 
straightforward way from the problem with M alleles and that with 2 alleles. 
This simple "nesting" of the M-allele problem in the (M -|- l)-allele one is 
responsible for many of the simplifications in the analysis and is at the heart 
of why the solution in the general case can be relatively simply presented. 

An illustration of the simple structure of this nesting is the fact that the 
general solutions, valid for an arbitrary number of alleles, with or without 
mutation, is made up of products of polynomials — more specifically Jacobi 
polynomials. Although the higher order versions of these polynomials can be 
quite complex, even after relatively short times only those characterised by 
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small integers are important. As mentioned in Section 2.2, we have given the 
solutions in terms of the transformed variables, but their form in terms of the 
original variables of the problem can be found using Eqs. (10) and (16). We 
have also presented the derivations of many other quantities of interest. In the 
interests of conciseness we have given only a flavour of some of these: some are 
new, some have already been derived by other means and yet others are simple 
generalisations of the two-allele results. Yet other results are more naturally 
studied in the context of the backward Kolmogorov equation, and we also 
took the opportunity to gather together the most significant, but nevertheless 
little-known, ones here. 

We have thus provided a rather complete description of genetic drift and mu- 
tation at a single locus. Nevertheless, there are some outstanding questions. 
For example, as discussed in Appendix B, the transformation (10) does not 
render the Kolmogorov equation separable for an arbitrary mutation matrix 
rriij, only one where that rate of mutation of the Aj alleles to Ai alleles {i ^ j) 
occurs at a rate independent of j. This is the reason for this simplified choice 
for the mutation matrix — a choice also made in all other research we are aware 
of. It might seem possible in principle to find another transformation which 
would allow a different form of rriij to be studied, but this seems difficult for 
a number of reasons. For instance, the transformation must still ensure that 
the diffusion part of the Kolmogorov equation is separable. Furthermore, the 
matrix has (M — 1)^ entries, and the transformation only (M — 1) degrees 
of freedom so for only certain, restricted, forms of rriij will a transformation 
to a separable equation be possible. Other questions involve the study of se- 
lection mechanisms or interactions between loci using the results here as a 
basis on which to build. We are currently investigating these questions in the 
context of language models, but we hope that the work reported in this pa- 
per will encourage further investigations along these lines among population 
geneticists. 
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A Boundary conditions in the absence of mutations 

As we have mentioned several times in the main text, although it might be 
thought that adding the possibility that mutations occur would complicate 
the problem, in many ways it is the situation of pure random mating that 
is the richer mathematically. So, for example, the case with mutations has 
a conventional stationary pdf given by Eq. (47), whereas without mutations 
the stationary pdf is a singular function which exists only on some of the 
boundaries. Of course, it is clear from the nature of the system being modelled 
that this has to be the case, but what interests us in this Appendix is the nature 
of the boundary conditions which have to be imposed on the eigenfunctions of 
the Kolmogorov equation (3) to obtain the correct mathematical description. 

The nature of the boundary conditions required when fixation can occur are 
discussed in the literature both from the standpoint of the Kolmogorov equa- 
tion in general [42,52,41] and in the specific context of genetic drift [39,53]. 
Here we will describe a more direct approach, which while not so general as 
many of these discussions, is easily understood and illustrates the essential 
points in an explicit way. It is sufficient to discuss the one-dimensional (two 
allele) case, since all the points we wish to make can be made with reference 
to this problem. 

The question then, can be simply put: what boundary conditions should be 
imposed on Eq. (1)? On the one hand one might feel that that these should 
be absorbing boundary conditions [41,40], because once a boundary is reached 
there should be no chance of re-entering the region < x < 1. On the other 
hand, however, the diffusion coefficient "naturally" vanishes on the boundaries 
which automatically guarantees absorption, so there would seem to be no need 
to further impose a vanishing pdf (as an absorbing boundary condition would 
require). In that case, what boundary conditions should be imposed? Also is 
it even clear, given the fact that the diffusion coefficient x{l — x)/2 becomes 
vanishingly small as the boundaries are approached, that the boundaries can 
be reached in finite time? 

To address these questions let us begin with the Kolmogorov equation (5) 
which includes a deterministic component A{x) as well as a random component 
represented by the diffusion coefficient D{x). Although we are interested in 
the case when A{x) is not present, it will help in the interpretation of the 
results if we initially include it. We also make two notational changes — we 
put a subscript / on A{x) which will be explained below and we will write 
D{x) = g'^{x)/2. The function g{x) is real, since D{x) > 0. Therefore our 
starting equation is 
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It turns out that it will more useful for our purposes to write this in the form 



dP 

'dt 



d_ 
dx 

d 



where 



As{x) = Ai{x) -]^g{x)^ 



(A.2) 



(A.3) 



Equations (A.l) and (A.2) are known as the Ito and Stratonovich forms of the 
Kolmogorov equation respectively [41,40]. There is no need for us to explore 
the differences between these two formulations; the only relevant point here is 
that the Stratonovich form is more convenient for our purposes. 

We now transform Eq. (A.2) by introducing the pdf Q(?/, t|?/0; 0) which is a 
function of the new variable y defined by 



dx 
9{x) 



The transformed equation reads 

dQ__ d_ 
dt dy 



Q = P^ = Pg{x) 

dy 



(A.4) 



A{y)Q 



where 



As{x) Ai{x) 



2 9y2 ' 
1 dg{x) 



(A.5) 



(A.6) 



g{x) g{x) 2 dx 
The system with an x-dependent diffusion coefficient has now been trans- 
formed to one with a state-independent diffusion coefficient, at the cost of 
adding an additional factor to the deterministic term. 



The problem of interest to us has Ai(x) = and g{x) = ^Jx{l — x). From 
Eq. (A.4) we find the required transformation to be x = sin^(y/2) or cosy = 
1 — 2x with < y < TT, and from Eq. (A.6) we find that A = — (l/2)cot|/. One 
way to understand this process intuitively is through a mechanical analogy: if 
we set A = then the Kolmogorov equation can be thought of as describ- 
ing the motion of an overdamped particle in the one-dimensional potential 
^{y) = ~ A{y)dy = (l/2)lnsin?/, subject to white noise with zero mean 
and unit strength. An examination of Fig. A.l shows that the boundaries are 
reached in a finite time. More importantly, from the relation Q = gP we see 
that imposing the absorbing boundary conditions Q{0,t) = Q{n,t) = only 
implies that the pdf P must diverge with a power smaller than 1/2 at the 
boundaries x = and x = 1. 
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Fig. A.l. Left: Potential V{y) in the actual genetic drift system. Right: Potential 
V{y) in the hypothetical system described in the text. 

These results should be contrasted with the hypothetical situation where 
Aj{x) = 0, but g{x) = x(l —x). In this situation we have that x = [1 + e^^]''^ 
or y = In (x/(l — x)) with — oo < y < oo, and from Eq. (A. 6) we find that 
A = (l/2)tanh(y/2). This corresponds to a process in which an overdamped 
particle is moving in the potential V{y) = — lncosh(i//2) and subject to white 
noise. For large \y\, V{y) ~ — \y\/2, which indicates that it will take an infinite 
time to reach the boundaries. This potential is shown on the right-hand side 
of Fig. A.l. Considerations such as these complement more mathematically 
rigorous studies by giving insights into the nature of the processes involved 
when the diffusion coefficient is dependent on the state of the system. 



B Coordinate system in which the Kolmogorov equation is sepa- 
rable 



The transformation from the coordinate system defined in terms of the set of 
allele frequencies, Xj, to that denoted by Ui, for which the Kolmogorov equa- 
tion is separable, is given by Eq. (10), with the inverse transformation given 
by Eq. (11). In this Appendix we explore the nature of this transformation, 
determine the Jacobian of the transformation and calculate how the jump mo- 
ments in the new coordinate system are related to those in the old one. This 
enables us to derive the Kolmogorov equation in the new variables — given by 
Eqs. (12) and (13) in the main text. We begin by looking at the nature of the 
transformation for low values of M. 

M = 2 : < Xi < 1. 

ui = xi =^ < ui < 1. 
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M = 3 : < Xi,X2 < 1. Since X3 = 1 
then Xi + X2 < 1. 



Ml = Xi 



U2 



Xi — X2 also satisfies this condition, 

X2 



1 — Xi 

Therefore, < ui,U2 < 1, with the three lines xi = (0 < X2 < 1) , 2:2 = 
(0 < Xi < 1) and Xi + X2 = 1 (0 < Xi < 1) go over to the three lines 
m = (0 < n2 < 1),U2 = (0 < ni < 1) and ^2 = 1 (0 < Ui < 1) 
respectively. The point xi = 1, X2 = goes over to the line Ui = 1 (0 < ^2 < 1). 
This is illustrated in Fig. B.l 




Fig. B.l. Schematic representation of the mapping of points in the xi — X2 plane 
onto the ui — U2 plane. 





" 1 



Fig. B.2. The range of values that can be taken by Xi and Uj in a 4 allele system. 
The points A, A', A" and A'" on the right all map onto the point A on the left. 
Similarly, B and B' both map onto B. 



M = 4 : < Xi, X2, X3 < 1 with the additional condition Xi + X2 + X3 < 1. 

X2 X3 



Ui = Xi 



U2 



1 — Xi 



^3 



1 - Xi - X2 



Therefore, < Ui,U2,u^ < 1. The planes x, = get mapped on to the planes 
Ui = {i = 1, 2, 3), the plane Xi + X2 + X3 = 1, Xi + X2 7^ 1 gets mapped on 
to the plane U3 = 1, < ui,U2 < 1, the line xi + X2 + X3 = 1, xi + X2 = 
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1, Xi 7^ 1 gets mapped to the plane M2 = 1, < M3 < 1, < Mi < 1 and 
the point Xi + X2 + X3 = 1, Xi + X2 = 1, Xi = 1 gets mapped to the plane 
^1 = 1, < ^2, U3 < 1. This is illustrated in Fig. B.2. 



General M : < Xi, . . . , Xa/-i < 1 with the additional condition Xi + . . . + 
XM-1 < 1- 

From Eqs. (10) and (11) we see that < Mj < 1, i = 1, . . . , M — 1. The pattern 
from the M = 3 and M = 4 cases should now be clear: the hyperplanes Xj = 
get mapped on to the hyperplanes Ui = (z = 1, . . . , M — 1), the hyperplane 
xi + . . . + xm~i = 1, 7^ 0, gets mapped on to the hyperplane um-i = 

1, < ui, . . . , um~2 < 1, • • • , the point xi = 1, x, = 0, i = 2, . . . , M — 2 gets 
mapped to the hyperplane ui = 1, < < 1, z = 2, . . . , M — 2. 



So, in summary, the region < Xj < 1 lying between the origin and the 
hyperplane Xi + . . . + xm-i = 1, gets mapped into the unit hypercube < 
Ui < 1. However, parts of the hyperplane xi + . . . + xm-i = 1 of dimensions 
M— 1, M— 2, . . . , 1 get mapped on to the hyperplanes = 1, i = 1, . . . , M — 1. 

The Jacobian of the transformation from tt to x is easily found if we note from 
(11) that dxi/duj = ii j > i. This implies that the Jacobian has zeros if 
the column label is greater than the row label, and so the determinant is just 
the product of the diagonal elements. Since the ith diagonal element is simply 
Xi/ui, we have that 

d{xi, . . . ,Xm-i) X1X2 Xm-1 f-. \M-2 /. nM-3 n 

— = . . . = (1 - Ui) {I-U2) ... (1 - Um~2) ■ 

(J[Ui, . . . ,Um-1) U1U2 Um^I 

(B.l) 

As explained in Section 2.2, the pdfs in the u variables, denoted by P, are 
related to those in the x variables, denoted by P, by a multiplicative factor 
which is simply the Jacobian: 



P(M,t)=P(x,t) ^;""--""^-^j , 

T'{u,t2;v,ti)=P{x,t2;y,ti) — -— -, ... , 

d{ui, um-i) o{vi, vm-i) 

from which (14) follows. 

The final point which we wish to discuss in this Appendix is the form of 
the Kolmogorov equation in the new u variables. As discussed in Section 3, 
the easiest way to determine this is to relate the jump moments in the old 
coordinate system, defined through Eq. (61), and those in the new coordinate 
system, defined through Eq. (63). To do this, let us begin with the first jump 
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moment: 



(AM,(t))^(,)^„ = I dv! iu[ - u,) P(m', t + At\u, t) 

= J dx' ifiix!) - Mx)) P{x', t + At\x, t) , 

where we have used (14) and where Ui = fi{x) is the transformation (10). Use 
of Taylor's theorem then gives 



(Au.(t))„(._ = E(A^.W). 



:{t)=X 



+ . . . . 



j k 



Since jump moments higher than the second vanish in the hmit At 
find from Eq. (B.2) that 



Similarly, 



j ^^3 j k ^-^3 



ki dxi 



dxjdxk 



(B.2) 
0, we 

(B.3) 
(B.4) 



The results (B.3) and (B.4) tell us how the jump moments transform from one 
coordinate system to another. In our case, we have from Eq. (10) 



duj 
dxj 



u'^i/xi, if j < i 
Ui/xi, ifj = i 
0, if j > z , 



(B.5) 



and 



dxkdxi 



(B.6) 



2ul/xj, a kj < i 

uf/xf, ii {k = i,l < i) ot {I = i,k < i) 

0, ifk = l = i 

0, if k > i or I > i . 

The key feature of the two results (B.5) and (B.6) is that they only depend on 
i, and not on j, k or /. This simplifies the summations which appear in (B.3) 
and (B.4). A still fairly tedious calculation shows that the second term on the 
right-hand side of (B.3) is zero and that the terms with i 7^ j in (B.4) are also 
zero. Specifically, 
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aAu 



aij[u) 



duj 
dxj 



duj 

det 



dt 



det 



Ui{l - Uj) 



(B.7) 
(B.8) 



In this paper the only deterministic mechanism which we will consider will be 
mutation, described by the differential equation 



(B.9) 



where rriij is the rate of mutation of allele j into allele i. From Eqs. (B.5) and 
(B.7) 

Ui 



(B.IO) 



We will denote the term in the brackets in Eq. (B.IO) by /3i{u), so that 

A(m) 



nj<i(l - Uj 



Using Eqs. (62) and (B.8), we find the Kolmogorov equation in the new coor- 
dinate system to be 



dV 



dt UJ<^il-U 



E 



d 



dui 



1 92 



[u,{l-u,)V] \ . (B.ll) 



In Section 3 we prove that the Kolmogorov equation is separable in the u 
coordinate system if Pi{u) depends on Ui only, and not on the uj, j ^ i. Since 
/3j involves a sum over the expression (B.9), which has itself to be written in 
terms of the m using Eq. (11), it is clear that this will not be the case for 
general m^. However, if it is assumed that the rate of mutation of the alleles 
Aj (j 7^ i) to Ai occurs at a constant rate m,, independent of j, then 
for all j ^ i and (B.9) becomes 



M 



M 



^ = mi (1 - Xi{t)) - [E"^ij ^»(^) = "^i - (^E"^i I ^»(^) • (B.12) 



In this case 



i-1 

Pi = Ui'Y "^i ~ -^""j E + "^j ~ R^i 
j=l j<i 

i-1 

= Mj ^ rrij + R{xi — Ui) + rrii — Rxi 
i=i 
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depends on Ui and that 



where R = Y^jLi and where we have used Eq. (11). We see that I3i{u) only 



nj<i(l - Uj) 

where Ri = Y^fLi f^j ■ This leads to the Kolmogorov equation 

(B.14) 

This is the content of Eqs. (12) and (13) in the main text. 

C Explicit solution of the second order differential equations 



In this Appendix we will provide further details relating to the solution of 
the equations (71) and (75) subject to the appropriate boundary conditions. 
Some of the details have already been given in Section 3, in particular how 
the substitution (78) leads to the hypergeometric equation (81). Consequently, 
much of this Appendix will be concerned with the hypergeometric functions 
F(a, 6; c; u) which are the solutions of this equation. We will refer to Chapter 
15 of the standard handbook by Abramowitz and Stegun [46] for the formulae 
that we use, but several appear sufficiently often that it is worth stating them 
explicitly here: 

1. lfc-a-6>0 and c 7^ 0, -1, -2, . . ., then 

V{c)V{c-a-b) 



F{a, b; c; 1) 



T{c-a)T{c-b) ' 



2. F{a, b; c, u) = (1 — u)^^'""'^^ F {a' , b'; c; u) where a' = c — a and b' = c — b. 
Note that c — a' — b' = —{c — a — b). 

3. If a or 6 is a non-positive integer then the power series for F{a, b; c; u) 
terminates: 

F{-n, a + l + (3 + n;a + l;u) = ^^^^j^^^^^PjT'^^ (1 - 2u) , 

r(n + a + 1) 

where P^'^"^ is a Jacobi Polynomial of order n. 

As we will see, the solution of the hypergeometric differential equation typi- 
cally proceeds as follows. The general solution to the equation (81) is written 
out as f{u) = Afi{u) + Bf2{u), where fi{u) and /2(m) are two independent 
solutions of the equation, and A and B are two arbitrary constants. Exam- 
ining the general solution in the vicinity of the boundary at u = rules out 
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one of the solutions, so that, for instance one must take i? = in order to 
satisfy the boundary condition there. The remaining solution is now examined 
in the vicinity of the other boundary at m = 1. If the solution has the form 
F(a, 6; c; u) with c — d — b > then we may use Result 1 above. If c — a — 6 < 0, 
then we may use Result 2 followed by Result 1. If it is found that either d or 
6 is a non-positive integer, then Result 3 may be invoked. 

We now separately examine the specific equations encountered in the absence 
and presence of mutations. 



C.l No mutations 



The values of the parameters are given in Section 3.3.2, where it is seen from 
Eq. (83) that c = 2. One of the solutions of the hypergeometric equation when 
c = 2 (suppose it is /2) has the following behaviour for small u [46]: 

f2{u) ^ — , asu^O, (C.l) 

(1 — a)(l — b)u 

and so this solution would not lead to a finite pdf as m — * — recall from Ap- 
pendix A that only divergences less strong than will be tolerated. There- 
fore 5 = 0. The remaining solution is fi{u) = F{a,b;2;u). From Eq. (83), 
2 — a — b = —l — 2'ji, so if we choose 7^ > 0, as described in Section 3.3.2, we 
may use Result 2 above to show that 

f-^{u) = {1 - u)~^^'-^ F{2 - a,2 - b;2;u) . 

Now 2 — (2 — a) — (2 — 6) = a + b — 2 = 27^ + 1 > and so we may use Result 
1 to show that 



, , , N-27i-i r(27i + 1) 
/iM ~ 1-M — — — — , asM 

Tar 6 



r(27. + 1) 



=> ^(n)~(l-n)-^-^^^^-^y^, as^^l. (C.2) 

We may once again use the result of Appendix A to deduce that this term 
cannot be present since it is diverging too fast as m — ^ 1 (the analysis of 
Appendix A can be straightforwardly generalised to cover the equations that 
are found from separation of the M allele Kolmogorov equation). We therefore 
require that either a or 6 equals — /, where / is a non-negative integer. Since 
the hypergeometric function is symmetric in a and b it does not matter which 
we choose: let a = —I. From Result 3 this means that the hypergeometric 
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function is terminating, and more specifically, 

Mu) = F{-1, 27. + / + 3; 2; u) = j^P^^''"'^+'\l - 2u) . (C.3) 
As shown in Section 3.3.2, 7^ = if we take 7j > 0, and so up to a constant 

We discuss some properties of Jacobi Polynomials below. 

Note that if we had made the choice 7^ < in Section 3.3.2 {i.e. 7^ = — — 
1), then the argument given above would still rule out solution /2(m) using 
Eq. (C.l), but we would have found the remaining solution to be fiiu) = 
F{a', b'; 2; u) where a' and b' are the parameter solutions relevant for the choice 
7j < and are related to those for the former 7j > choice by a' = 2 — a 
and 6' = 2 — 6. This can be easily seen from Eq. (83): if 7^ is one solution 
of 7i(7i + 1) = 2A', then a second one is 7- = — 1 — 7^. But then Eq. (83) is 
unchanged if 7^, a and b are replaced by 7^', a' = 2 — a and b' = 2 — b. So for 
the choice 7' = — Lj-i — 1 we have shown that 

^{u) = (1 - uy"^'-^-' F(2-a, 2-6; 2; n) ~ (1 - n)'^-^ TT^FT^ ' as n ^ 1 , 

r(a)r(6) 

where the 7^, a and b are those for the 7j > choice. This is exactly Eq. (C.2) 
and so we again deduce that a = —I, where / is a non-negative integer. Using 
Result 2 we then have that 



fi{u) = (1 - uf"''^^ F{-1, 2L,_i + l + 3;2;u) 
^ ^,,,,-.)(n) = (1 - u)'-^ p(i'27.+i)(i _ , 

which is Eq. (C.4). 

The final step in calculating the solutions is to apply the initial conditions to 
find the appropriate coefficients. We use the appropriate orthogonality rela- 
tionship (which we shall give below) and use the method described in (66)- (68), 
and find w{uq) and ci^{Li) as stated in (19) and (23). 



C.2 Mutations present 



The values of the parameters are given in Section 3, where it is seen from 
Eq. (82) that c = 2(1 — mi). The nature of the solutions of Eq. (81) depends 
on whether c is an integer, and if it is, what its value is [46]. The simplest case 
is if c is not an integer, then the general solution has the form 

AF{a, b; c; u) + Bu^~^F{a - c + 1, 6 - c + 1; 2 - c; n) . (C.5) 
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Both of these hypergeometric functions are of the form F{a, b; c, u) with c— a — 
b > —1 and so have power-series expansions near u = [46]. Substituting the 
expression (C.5) into the boundary condition (90) gives A = 0. The analysis 
when c = 1 and c = 0,-1, —2, . . . has to be done separately. One of the 
solutions has a factor of Inu in these cases, but it turns out that, while these 
solutions have a current which does not diverge at n = 0, the current is non- 
zero, which is forbidden by the reflecting boundary conditions. Therefore this 
solution must be absent. It is found that in all cases the implementation of 
the boundary condition at m = shows that the required solution is f{u) = 
u^~^F{a - c + 1, 6 - c + 1; 2 - c; m). 

We now note that the conditions which determine the parameters a and b 
from Eq. (82) can be simplified if we introduce a new parameter ki through 
a = 1 + '-^i + ki. Then 6 = 7^ + 2 — 2Ri — ki, where ki{2Ri + ki — 1) = 2X. 
So choosing 71 = — L, this being one of the solutions of Eq. (93), we have 
that c = 2(1 — mi), a = 1 + ki — L and 6 = 2 — 2Ri — ki — L and therefore 
f(u) = n2'"i"iF(2mi + ki - L,l - R2 - ki - L; 2mi; u). Using Resuh 2 we 
may write this in an alternative form: 

/i(m) = M^^i-i (1 _ m)2(^+^2)"1 f{L -ki,ki + L + 2Ri - 1; 2mi; u) . (C.6) 



We may now apply the boundary condition that the current vanishes when 
u = 1, which leads to Eq. (90). After some algebra we find that this condition 
implies 



= lim |n2"i(l -u)-^} {-LF(2mi + ki - L,l - R2 - ki - L; 2mi; u) 
[L - ki){2Ri + ki + L - 1] 



2mi 



F(2mi + ki - L, 1 - R2 - ki - L;2mi + 1 



We may now use Result 1 to show that for L > 1 the term in the second set 
of curly brackets has a finite limit as m — 1, specifically this term tends to 

(2i?2 + L- l)r(2mi)r(2i?2 + 2L - 1) 



T{L-ki)T{2Ri + ki + L-l) 

Thus either L — ki or 2Ri + ki + L — l must be a non-positive integer if we are 
to obtain a finite current (let alone a zero-current) as w — >■ 1. The case L = 
may be treated separately, since the first term in the bracket, to which Result 
1 does not immediately apply is absent. The same result is found. To show 
that these conditions also imply a zero current, we may substitute ki = L + 1 
or /ci = 1 — 2Ri — L — I into Eq. (C.6) and find, when ki + L + 1 for instance 
that 
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= lim {u"^""' (1 - {-LF{-1, 2i?i + 2L - 1 + /; 2mi; u) 

U—^l >- J 

+ (l-u)^ Fi-l, 2Ri + 2L-l + /; 2mi + l;u) \ . (C.7) 
du J 

Since by Result 3 the two hypergeometric functions are polynomials of order /, 
the right-hand side is indeed zero if L > 1, and also when L = 0, since the first 
term in the second curly bracket of Eq. (C.7) is absent. Note that if we had 
made the choice ki = 1 — 2Ri — L — l then the hypergeometric functions which 
would have appeared would have been of the form F(2_Ri+/ + l + /, — /; 2mi; u), 
which again are polynomials by Result 3. 

Therefore, returning to Eq. (C.6), we have found that 

= u^^i-i (1 - uf(L+R2)-i 2(L + Ri)-l + l; 2mi; u) . 

Using Result 3 this is, up to a constant, 

AH = n°(l-nfP^^)(l-2n), (C.8) 

where a = 2mi — 1 and (3 = 2(i?2 + L) — 1. Writing it in terms of the function 
ip we have that 

It is also straightforward to check that if we made the choice 71 = 2i?2 + L—1, 
rather than 71 = —L, then the analysis near u = would be as before, but 
then use of Result 2 would then show that the result we obtain was exactly 
as before (but with a and b interchanged). 

In summary, we have shown that ki = L + l, where / is a non-negative integer. 
Thus for M = 2, when 71 = L = 0, and so from 2A = ki{2Ri + ki-l), 
we obtain Eq. (91). For M = 3, we use Eq. (92), with L = li and so obtain 

A(2) 

given by 

A(2) = lL'(2i?i + L'-l) , (C.IO) 

where L' = L + I, and / is a non-negative integer and where Ri = J2^=i^j- 
Continuing in this way we find the result given by Eq. (94). 

We again use the orthogonality properties of the Jacobi Polynomials as de- 
scribed in (66)-(68) to define q. in (33). 

C.3 Jacobi Polynomials 

All the solutions for the pdfs considered in this paper consist of Jacobi Poly- 
nomial p/"'^^(l — 2u). All properties of these functions that we require are 
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given in Chapter 22 of Abramowitz and Stegun [46], but here we will list the 
key results that we use. 



The first point to note is that the natural variable for these functions is 2 = 1 — 
2u. Thus while u e [0, 1], z G [—1, 1]. They satisfy the following orthogonality 
relation in the u variable: 

/o, o r(/ + i)r(/ + a + /5 + i) 







dMM"(l - m)^p/"'''Hi - 2M)P;h^)(l - 2u) = 6i^v , (C.ll) 



where a ^(3 > —1. In the eigenf unctions calculated when mutations are present, 
the functions which make up the right eigenfunction $a as defined in 
Eq. (31), contain the factor u"(l — u)^ which appears in Eq. (C.ll), as can 
be seen in Eq. (32). On the other hand the functions 9^''^ which make up the 
left eigenfunction Qx as defined in Eq. (37), do not contain this factor, as can 
be seen in Eq. (38). The result is that the right and left eigenfunctions satisfy 
the simple orthogonality relation / duQ\{u)^y{u) = 5a,a'- 

Secondly, Kimura uses the notation of a J polynomial, which relates to the 
Jacobi polynomial as follows: 

Ua + 2, a + 1, 1 - X) = ^p,^!^^^!|/V ir^i^'"ni - 2x) . (C.12) 

T{n + a + 1) 



In the derivation of the fixation probability In Section 2.4.1 we make use of the 
following relationship between Jacobi and Legendre polynomials which can be 
found in Ref. [46]: 

(1 - z')P^'^'\z) = 1^ mz) - zP,^,{z)] , (C.13) 

as well as this identity for Legendre Polynomials, which can be found in 
Ref. [54]: 

zPi+^{z) - Pi{z) = ii±4[P/+2(^) - Pi{z)] . (C.14) 
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